Beijing University of Posts and Telecommunications–SocialNetVis

VAST 2009 Challenge
Challenge 2 - Social Network and Geospatial

Authors and Affiliations:

e.g. Qi Ye, University of Posts and Telecommunications, yeqibupt@gmail.com [PRIMARY contact]
      Tian Zhu, University of Posts and Telecommunications,
zhutian.bupt@gmail.com
     
Deyong hu, University of Posts and Telecommunications, hudeyong@tseg.org
      Jian
Liu, University of Posts and Telecommunications, liujianbupt@gmail.com
      Chao Han,
University of Posts and Telecommunications, hanchaobupt@gmail.com
      Bin Wu, University of Posts and Telecommunications, wubin@bupt.eud.cn [Faculty advisor]

     
Bai Wang, University of Posts and Telecommunications, wangbai@bupt.edu.cn

Tool(s):

Taking the social network analysis and visual analytics approach, based on our graph visual analytical framework of JSNVA, we develop a tool called SocialNetVis in Java to analyze the massive Flitter user graph. First and foremost, to find out the typical social subgraph in the massive social network, we write two short programs based on the data structure in JSNVA. After the typical subgraph has been found, we can use SocialNetVis to explore the massive social network and validate our hypothesis. SocialNetVis  keeps three data structures for graphs: the raw graph which provides the structure of original graph; the subgraph contains a subgraph in the raw graph; the community graph is an abstract graph derived from the raw graph in which each vertex is a community and the edges indicate the relationships between communities. Fig. 1 is the primary user interface of SocialNetVis. SocialNetVis classifies the operations into the following steps: topological statistical analysis, community detection and visual analysis. In the topological statistical analysis step, we use different network algorithms to get the topological statistical properties of the raw graph and extract the subgraph from the raw graph by different filtering metrics. In the community detection step, users can extract the statistically significant dense subgraphs or communities from the original raw graph. In the visual analysis step, users can show the raw graph, subgraph and community graph in new network visualization frames, respectively. After get the typical social structure, by using SocialNetVis, we can explore the total structure or certain subgraphs of the network and verify our hypotheses efficiently.

 

Figure 1 the SocialNetVis System

Video:

 

SocialNetVis Video

 

ANSWERS:


MC2.1: Which of the two social structures, A or B, most closely match the scenario you have identified in the data?

A

MC2.2:  Provide the social network structure you have identified as a tab delimitated file. It should contain the employee, one or more handler, any middle folks, and the localized leader with their international contacts. What are the Flitter names of the persons involved? Please identify only key connections (not all single links for example) as well as any other nodes related to the scenario (if any) you may have discovered that were not described in the two scenarios A and B above. 

Flitter.txt


MC2.3:  Characterize the difference between your social network and the closest social structure you selected (A or B). If you include extra nodes please explain how they fit in to your scenario or analysis. 

To find out the typical social structure in the massive social network, we write two short programs based on graph data structure in the framework of JSNVA. We first explore statistical topological characters of the Flitter social network using our tool. In the social network there are 6000 Flitter users and 29876 edges. In the topological statistical analysis step, we find the degree distribution of the Flitter user network has an obvious heavy tail as most real world networks. There is only one component in the social network.

To find out the closest social structure in the Flitter user network, we mainly use the number of Flitter users’ neighbors given by the known conditions to get the employee, handlers, middle man and Fearless leader. As the Fearless Leader probably has a broad Flitter network (well over 100 links), we first get the vertices whose degrees are above 100. As the well known heavy tail phenomenon of degrees, there are only 24 vertices selected. After selecting the large degree vertices as potential Fearless Leaders, we try to find their spanning trees in 4 hops. We also set some metrics to filter uninterested vertices: each handler’s degree is between 30 and 40, the employee’s degree is between 30 and 50, in structure A the middle man’s degree is at most 6 (3 handlers, 1 Fearless leader and at most 2 others) and in structure B each middle man’s degree is at most 4 (1 handler, 1 Fearless leader and at most 2 others). The middle man’s degree is a very important metric. Without this metric, we can find many other similar social structures in structure B. By using all of these metrics, we write two programs in Java based on our framework JSNVA to find these structures. We find out only one subgraph that matches all these metrics in social structure A and no one matches the metrics in structure B.

By using these metrics to find typical social structure, as shown in Fig.2, we find the closest social structure is structure A. The employee’s ID is 100, and his degree is 40. The Fearless leader’s ID is 4, and his degree is 256. The middle man’s ID is 4994, and his degree is 5. The IDs of handlers are 194, 563, 261 and their degrees are 33, 37 and 31, respectively. As in the given condition “Boris (middle man) communicates with one or two others in the organization and no one else”, we find the middle man (4994) has another neighbor in the organization whose ID is 1612.  As shown in Fig 2, we get the subgraph of these vertices and there are no links between handlers. We also find out the international contracts of the Fearless Leader.

Figure 2 Closest Social Structure of the Organization


MC2.4:  How is your hypothesis about the social structure in Part 1 supported by the city locations of Flovania? What part(s), if any, did the role of geographical information play in the social network of part one? 

As shown in Fig. 3, the employee, handlers, middle man and Fearless Leader are in all Flovania. The employee is in Prounov which is a large city near the largest city Koul. All handlers are also in Prounov, so it may be convenient for them to communicate with the employee. The other neighbor of middle man is also in Prounov.  The middle man is in Kannvic which is a smaller city nearby Prounov. The Fearless Leader is in Kouvnic which is a mid-sized border city of Flovania. By linking Flitter IDs to Cities, we find there are 798 Filtter users in Kouvnic and there are 320 Fitter users in Kannvic. As the number of Filtter users are in proportion to the size of the city in the common cases, we believe that Kouvnic is larger than Kannvic. As shown in Fig. 4, we regard the information sent by employee is from Prounov to Kannvic, to Kouvnic and to other counties.

The given conditions tell us “A target and handler may be in a large city, a middleman might be in nearby smaller locations. A leadership role, such as the one of Fearless Leader, would likely require a presence in a larger city”. The employee and handlers are all in Prounov. The middle man is in Kannvic nearby Prounov. The Fearless Leader is in Kouvnic which is a larger border city. All the geographical information is in accordance with the given conditions.

Figure 3 Flitter Users’ Cities in the Closest Social Structure of the Organization

Figure 4 Information from the Employee to the Fearless Leader’s International Contacts


MC2.5:  In general, how are the Flitter users dispersed throughout the cities of this challenge? Which of the surrounding countries may have ties to this criminal operation?  Why might some be of more significant concern than others?

Table 1 shows how the Flitter users dispersed through out the cities.

Koul

1998

Prounov

1707

Kouvnic

798

Kannvic

320

Solvenz

210

Sresk

147

Otello

147

Pasko

147

Ryzkland

142

Solank

135

Transpasko

126

Tulamuk

123

Table 1 the Number of Filtter Users in Cities

As the Fearless leader is in Kouvnic and he has the interaction Flitter links with the people in Tulamuk, Otello and Transpasko, we think Trium, Posana and Transak may all have ties to this criminal operation. We think Trium may be of more significant concern than other countries, as the Fearless Leader is in Kouvnic which is a border city closest to Trium. It may be convenient for him to go abroad to Trium than to other countries.